Abstract

In this paper, we consider a scenario where an eavesdropper can read the content of messages transmitted 
over a network. The nodes in the network are running a gradient algorithm to optimize a quadratic utility 
function where such a utility optimization is a part of a decision making process by an administrator. We are 
interested in understanding the conditions under which the eavesdropper can reconstruct the utility function 
or a scaled version of it and, as a result, gain insight into the decision-making process. We establish that if 
the parameter of the gradient algorithm, i.e., the step size, is chosen appropriately, the task of reconstruction 
becomes practically impossible for a class of Bayesian filters with uniform priors. We establish what step-size 
rules should be employed to ensure this. 
1. Introduction

In recent decades, tremendous advances in the areas of communication and computation have facilitated 
the construction of complex systems. The design and analysis of these systems involve solving large optimization
problems. Utility maximization, optimal flow, expenditure minimization, and traffic optimization are
examples of such problems. Due to the size of these problems, it is often required that problems are solved 
over a network of interconnected processors. In many scenarios, the implementation of the solution to the 
optimization problem is in the public domain. However, from an operational point of view, it is important 
that the way the decision is made remains confidential. In other words, while the optimal decision can be
known by everyone, the utility function itself should remain confidential. Portfolios in portfolio optimization 
and local utilities in resource allocation can be considered as examples of such utility functions that need 
to be kept confidential. This especially becomes an issue as more computations related to operating
critical infrastructure (e.g., power distribution networks) are carried out in the cloud [1]. The importance of
confidentiality, integrity, and availability is well understood in the security of data and ICT services and
cloud computing [?]. In these settings, confidentiality corresponds to ensuring the non-disclosure of data,
integrity is related to the trustworthiness of data, and availability is concerned with the timely access to the 
data or system functionalities. 

In this paper, we mainly focus on the question of confidentiality—particularly, the confidentiality of the 
utility functions even when the security of the network is compromised and an eavesdropper can listen to 
all the information being exchanged over the network during the course of solving the optimization problem. 
We consider scenarios where the utility function has a quadratic form. Specifically, the following question is 
answered: when is it possible to reconstruct a utility function, or a scaled version of it, via having access to the 
iterations produced by an iterative method? The iterative method considered in this paper is a gradient ascent 
algorithm. The choice of a gradient algorithm is in line with recent observations that cast a favourable
light on employing first-order methods to solve very large optimization problems [?]. Note that the choice
of quadratic programs is not very restrictive, as trust-region optimization techniques allow us to solve any
general optimization problem using a sequence of constrained quadratic programs recursively.

The problem that is addressed here is related to the one considered in the context of differential privacy [?]
and, to a larger extent, the application of differential privacy in optimization [?, ?, ?]. However, it is
important to note that there, the price for guaranteed confidentiality is paid in terms of data integrity and 
the accuracy of the solution. To ensure differential privacy, it is known that the information passed between 
the processing nodes at each step of the optimization algorithm should be perturbed by a random variable 
from a Laplace distribution [?]. This results in the algorithm not yielding an accurate solution. Here, we
argue that the confidentiality of the objective (but not the solution) can be guaranteed in practice with no 
impact on the accuracy of the solution, if the algorithm parameter (the step size) is chosen appropriately, 
i.e., it is picked randomly from a sufficiently large set of suitable step sizes. In addition to differential 
privacy, other notions of privacy in optimization and machine learning have recently been pursued, e.g., 
see [10, 11, 12]. Note that, in this paper, we are not directly contributing to the privacy-preserving literature,
per se. Our main objective is to point out that, in the setups discussed, one does not need to worry about
privacy since estimating the underlying parameters is practically impossible due to computational restrictions
(at least, with current technologies). Note that this problem is also related to the system identification and
the parameter estimation literature, where the aim is to extract the parameters of the underlying utility or
dynamics. However, in our setup, the eavesdropper cannot inject proper reference signals to fully probe the
system (a condition commonly known as persistent excitation, which is necessary for achieving the estimation
objective [13]). Finally, in [14], the agents use their actions to learn about the strategies or the utilities of the
other agents to subsequently devise optimal strategies. However, in that study, the computational aspects of
the problem were largely unexplored and only linear programs were considered. 

The outline of this paper is as follows. In the next section, the problem that is considered in this paper 
is formulated. In Section 3, we consider the case where the eavesdropper has access to the iterates that are
generated during the course of solving an unconstrained quadratic program. In this section, different choices
of the step size are considered and conditions under which the utility function cannot be reconstructed successfully
are discussed. Next, in Section 4, we consider the case where the problem is constrained. Concluding remarks
are given in Section 5.


1.1. Notation 

The sets of reals, nonnegative reals, integers, and nonnegative integers are, respectively, denoted by $\mathbb{R}$,
$\mathbb{R}_{\geq 0}$, $\mathbb{Z}$, and $\mathbb{Z}_{\geq 0}$. The rest of the sets are denoted by calligraphic Roman letters, such as $\mathcal{M}$. Specifically,
$\mathcal{S}_+^n$ is defined to be the set of symmetric positive-definite matrices in $\mathbb{R}^{n \times n}$. We define $\mathrm{vec} : \mathbb{R}^{n \times m} \to \mathbb{R}^{nm}$
to be a vectorization operator that puts all the columns of a matrix into a vector sequentially. Finally, we
use $A \otimes B$ to denote the Kronecker product of matrices $A$ and $B$.


2. Problem Formulation 

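Consider the following optimization problem:

\begin{align*}
\max_{x} \quad & -\tfrac{1}{2} x^\top Q x - q^\top x, \tag{1a} \\
\text{s.t.} \quad & Cx \leq d, \tag{1b}
\end{align*}

where $Q \in \mathcal{S}_+^n$, $q \in \mathbb{R}^n$, $C \in \mathbb{R}^{m \times n}$, $d \in \mathbb{R}^m$, and $\mathcal{X} = \{x \in \mathbb{R}^n \mid Cx \leq d\} \neq \emptyset$. The optimization problem (1)
is solved by an administrator over a network via an optimization method, given by

\[ x[k+1] = F(x[k]), \quad x[0] \in \mathcal{X}. \tag{2} \]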
Throughout this paper, we assume that $F(\cdot)$ is the gradient ascent algorithm in which different step-size
selection methods can be used. This assumption is, partly, motivated by favourable results on first-order
methods for solving large-scale optimization problems [?]. However, this assumption is also in place to
greatly simplify the proofs and the presentation.

Remark 1. At first glance, the update rule in (2) appears to be a centralized implementation. However,
distributed algorithms using primal decomposition as well as the inner problems for distributed algorithms
using dual decomposition (see [?]) can both be rewritten, albeit in an aggregated form, in the form of (2).

Remark 2. The results presented in this work, at least in part, are applicable to more general utility functions, 
e.g., logarithmic functions. However, the selection of the quadratic utility functions results in linear operators 
that greatly simplify the proofs. Moreover, the quadratic utility functions, although partially conservative,
have many applications and are widely used in signal processing, e.g., weighted least squares, and machine
learning, e.g., support vector machines (SVM) [?].

The measurement model of the eavesdropper is as follows. For any two consecutive measurements of the
optimization variable $x[k]$ and $x[k+1]$, for some $k \in \mathbb{Z}_{\geq 0}$, the eavesdropper can construct a measurement of
the form

\[ y[k] = x[k] - x[k+1]. \tag{3} \]

Therefore, at time step $k+1$, the eavesdropper has access to the measurement pairs $(x[t], y[t])_{t=0}^{k}$. We are
interested in the solution to the following problem.

Problem 1 (Utility Function Reconstruction). Assuming that the eavesdropper can measure $x[k]$ for all $k$
and the values of $C$ and $d$ are known, under what conditions on the step-size selection of the gradient ascent
algorithm can the eavesdropper estimate $(\hat{Q}, \hat{q})$ such that $Q = \gamma \hat{Q}$ and $q = \gamma \hat{q}$ for some $\gamma > 0$?

Solving the problem above enables the eavesdropper to determine the way that decisions are made. 
For example, it can be determined which variable has a bigger impact on the solution of the optimization
problem [?]. Hence, it is not necessary to exactly estimate $\gamma$.

Remark 3. In this paper, we assume that the communication is carried out over real and noiseless channels. 
Alternatively, one may consider the effects of quantisation and noise on the utility reconstruction problem. 
However, this is beyond the scope of this paper. 

Finally, we have the following standing assumption. 
Assumption 1. The parameters $(Q, q) \in \mathcal{Q} \subseteq \mathcal{S}_+^n \times \mathbb{R}^n$ are randomly generated according to the nondegenerate
probability density function $p : \mathcal{Q} \to \mathbb{R}_{\geq 0}$. Further, we assume the distribution of $(Q, q)$ is
independent of the initialization of the algorithm $x[0]$, which is uniformly selected from $\{x \mid x^\top x \leq 1\}$. The
eavesdropper knows these probability distributions.

In the following sections, we first consider the case where problem (1) is unconstrained, and then we
study the constrained case. Note that the choice of unconstrained quadratic programs is not restrictive. We
can solve any general optimization problem using a sequence of constrained quadratic programs recursively
using trust-region optimization techniques. Further, when using primal-dual techniques, we can solve a constrained
quadratic program recursively using a sequence of unconstrained quadratic programs. Alternatively,
as also discussed in Section 4, we can use logarithmic barrier functions to solve a constrained quadratic
program.

3. Unconstrained Case 

In the case where the optimization problem is unconstrained, the gradient iterations are such that 

\[ x[k+1] = x[k] - a[k]\,(Qx[k] + q), \tag{4} \]

where $a[k]$ is an appropriately selected step size (e.g., it is well known that if $a[k]$, $\forall k \in \mathbb{Z}_{\geq 0}$, belongs to an
appropriately selected interval on the positive reals, the iterations in (4) converge to the optimal solution [?]).
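
As a minimal numerical sketch of the gradient iteration (4) and of the measurement in (3), the snippet below runs the iteration on a small instance. The matrix `Q`, the vector `q`, the constant step size `a`, and the starting point are hypothetical values chosen only for illustration; none of them come from the paper.

```python
import numpy as np

# Hypothetical problem data (illustrative only): Q is symmetric positive definite.
Q = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
q = np.array([1.0, -2.0, 0.5])
a = 0.1                             # constant step size, below 2 / lambda_max(Q)
x = np.array([0.3, -0.2, 0.4])      # x[0], chosen inside the unit ball

xs, ys = [], []
for k in range(200):
    x_next = x - a * (Q @ x + q)    # gradient iteration (4)
    xs.append(x)
    ys.append(x - x_next)           # measurement (3) built by the eavesdropper
    x = x_next

xs, ys = np.array(xs), np.array(ys)
x_star = -np.linalg.solve(Q, q)     # maximizer of -x'Qx/2 - q'x
print("final iterate:", x)
print("optimum      :", x_star)
```

Each measurement equals the step size times the gradient term, which is the relation the eavesdropper exploits in the remainder of the section.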

Remark 4. As remarked earlier, in this paper our primary interest is in the case where the gradient iterations
are implemented in a distributed manner. For instance, if we employ $n$ processors, each processor needs to
follow the update rule

\[ x_i[k+1] = (1 - a[k] q_{ii})\, x_i[k] - \sum_{j \neq i} a[k] q_{ij} x_j[k] - a[k] q_i, \]

where $x_i[k]$ is the $i$-th element of the decision vector $x[k]$. To implement this update rule, the processors need to
communicate the elements of the decision vector over the directed graph $\mathcal{G}$ with vertex set $\mathcal{V}_{\mathcal{G}} = \{1, \dots, n\}$ and
edge set $\mathcal{E}_{\mathcal{G}} = \{(i, j) \mid 1 \leq i \neq j \leq n,\ q_{ij} \neq 0\}$. The messages that the processors pass contain $x_i[k]$, $\forall i$, and by
observing these messages, the eavesdropper can obtain $(x[t])_{t \in \mathbb{Z}_{\geq 0}}$. As a viable avenue for future research, we
can consider the scenario in which the eavesdropper can only listen to a subset of the transmitted messages.
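
The per-processor update rule in Remark 4 can be checked against the vector form of the gradient iteration with a short sketch. The data below and the helper name `local_update` are hypothetical and only serve the illustration.

```python
import numpy as np

# Hypothetical data (illustrative only).
Q = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
q = np.array([1.0, -2.0, 0.5])
a = 0.1
x = np.array([0.3, -0.2, 0.4])
n = len(x)

def local_update(i, x, Q, q, a):
    """Update of processor i, using row i of Q and the received entries x_j."""
    coupling = sum(a * Q[i, j] * x[j] for j in range(n) if j != i)
    return (1.0 - a * Q[i, i]) * x[i] - coupling - a * q[i]

x_distributed = np.array([local_update(i, x, Q, q, a) for i in range(n)])
x_centralized = x - a * (Q @ x + q)

# Edges (i, j), 1-based: processor i needs x_j whenever q_ij is nonzero.
edges = [(i + 1, j + 1) for i in range(n) for j in range(n)
         if i != j and Q[i, j] != 0]
print(edges)
```

In this sketch, the entries of the decision vector travel only along the edges listed in `edges`, and an eavesdropper who reads every message recovers the full vector at each step.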

In this scenario, the eavesdropper measurement model given by (3) becomes

\[ y[k] = a[k]\,(Qx[k] + q). \]

As before, at time step $k+1$, measurement pairs $(x[t], y[t])_{t=0}^{k}$ are available to the eavesdropper. We make
the following standing assumption.

Assumption 2. The vectors $(x[t])_{t=0}^{n-1}$ are linearly independent.

We formally characterize the conditions under which Assumption 2 holds in the course of this paper for each
scenario.


Remark 5. Under Assumption 1, the independence assumption is without loss of generality (i.e., Assumption 2
holds almost surely). Note that when using the gradient ascent algorithm, for the measurements to be
dependent, the algorithm should be initialized at a point along one of the eigenvectors of $Q$, and $q$ should be
parallel to that specific eigenvector.

Remark 6. Let us briefly explain why Assumption 2 is necessary. In this remark, we assume that $a[k]$,
$k \in \mathbb{N}$, is known. Note that this shows the necessity of Assumption 2 as it discusses a more restrictive setup
(because the eavesdropper has access to more information). In such a case, we have

\begin{align*}
Qx[k] + q &= \mathrm{vec}(Qx[k] + q) \\
&= \mathrm{vec}(Qx[k]) + q \\
&= (x[k]^\top \otimes I)\,\mathrm{vec}(Q) + q,
\end{align*}

where the last equality follows from Item (5) in [?, p. 97]. This gives

\[
\left( \begin{bmatrix} x[0]^\top & 1 \\ \vdots & \vdots \\ x[k]^\top & 1 \end{bmatrix} \otimes I \right)
\begin{bmatrix} \mathrm{vec}(Q) \\ q \end{bmatrix}
=
\begin{bmatrix} y[0]/a[0] \\ \vdots \\ y[k]/a[k] \end{bmatrix}. \tag{5}
\]

Let us denote the matrix on the left-hand side of (5) by $G$. To avoid admitting redundant equations, $G$ should
have full row rank. From the properties of the Kronecker product [?, p. 58], if Assumption 2 holds, for all
$k \leq n - 1$, we know that

\[
\mathrm{rank}\left( \begin{bmatrix} x[0]^\top & 1 \\ \vdots & \vdots \\ x[k]^\top & 1 \end{bmatrix} \otimes I \right)
= \mathrm{rank}\left( \begin{bmatrix} x[0]^\top & 1 \\ \vdots & \vdots \\ x[k]^\top & 1 \end{bmatrix} \right) \mathrm{rank}(I)
= (k+1)n.
\]

The number of rows of $G$ is also equal to $(k+1)n$ and, hence, $G$ has full row rank. Consequently, there is
no redundant equation.
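
The vectorization identity and the rank claim in Remark 6 can be checked numerically. The following sketch uses hypothetical data and the column-stacking convention for vec; it builds the matrix $G$ for $k = n - 1$ from iterates of a constant-step gradient iteration.

```python
import numpy as np

# Hypothetical data (illustrative only).
Q = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
q = np.array([1.0, -2.0, 0.5])
a = 0.1
n = 3

x = np.array([0.3, -0.2, 0.4])
X = []
for _ in range(n):                    # collect x[0], ..., x[n-1], i.e. k = n - 1
    X.append(x)
    x = x - a * (Q @ x + q)
X = np.array(X)

# Identity vec(Q x) = (x^T kron I) vec(Q), with vec stacking the columns of Q.
vecQ = Q.flatten(order="F")
lhs = Q @ X[0]
rhs = np.kron(X[0][None, :], np.eye(n)) @ vecQ

# G = [x[t]^T 1] kron I, stacked over t = 0, ..., k.
X_aug = np.hstack([X, np.ones((n, 1))])
G = np.kron(X_aug, np.eye(n))
rank_G = np.linalg.matrix_rank(G)
print(G.shape, rank_G)
```

For this instance, `G` has $(k+1)n = 9$ rows and rank 9, so no equation in (5) is redundant.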
3.1. Constant Step Size

In this subsection, we assume $a[k] = a > 0$ for all $k \in \mathbb{Z}_{\geq 0}$. In this case, it is not possible to reconstruct
$Q$, $q$ uniquely because $a$ shows itself as a scaling factor in these matrices. In other words, for $\gamma$ in Problem 1,
we have $\gamma = 1/a$. Let us construct the set $\mathcal{M}[k] = \{(Q', q') \in \mathcal{S}_+^n \times \mathbb{R}^n \mid y[t] = Q'x[t] + q', \forall t = 0, \dots, k\}$. This
denotes the set of parameters that are consistent with our measurements up to time step $k+1$ (note that
for constructing $(y[t])_{t=0}^{k}$, the eavesdropper needs to measure $(x[t])_{t=0}^{k+1}$). Evidently, by construction, these
sets are nonexpansive, i.e., $\mathcal{M}[k+1] \subseteq \mathcal{M}[k]$ for any $k \in \mathbb{Z}_{\geq 0}$. First, let us present a condition under which
Assumption 2 holds.


Remark 7. The introduced estimator $\mathcal{M}[k]$ has close connections to the idea behind set-membership
identification, in which the set of permissible parameters is gradually reduced by removing realisations that are
not compatible with the newly received measurements; see [?].


Lemma 1. Let the distribution $p$ governing $Q$ (cf. Assumption 1) be such that the algebraic multiplicity of
every eigenvalue of $Q$ is almost surely equal to one. Then, $(x[t])_{t=0}^{n-1}$ are almost surely independent.


Proof. Notice that

\[ x[k+1] = x[k] - a(Qx[k] + q) = (I - aQ)x[k] - aq. \]

Therefore, we have

\[
\begin{bmatrix} x[0]^\top \\ x[1]^\top \\ \vdots \\ x[n-1]^\top \end{bmatrix}
=
\begin{bmatrix} x[0]^\top \\ x[0]^\top (I - aQ)^\top \\ \vdots \\ x[0]^\top \big((I - aQ)^\top\big)^{n-1} \end{bmatrix}
-
\begin{bmatrix} 0_{1 \times n} \\ (aq)^\top \\ \vdots \\ (aq)^\top \sum_{i=0}^{n-2} \big((I - aQ)^\top\big)^{i} \end{bmatrix}. \tag{6}
\]

Since $x[0]$ is selected randomly and independently from the pair $(Q, q)$, the iterates $(x[t])_{t=0}^{n-1}$ are independent
(i.e., the matrix on the left-hand side of (6) is full rank) if

\[
\mathrm{rank}\begin{bmatrix} x[0]^\top \\ x[0]^\top (I - aQ)^\top \\ \vdots \\ x[0]^\top \big((I - aQ)^\top\big)^{n-1} \end{bmatrix} = n, \tag{7}
\]

which is equivalent to the pair $((I - aQ)^\top, x[0]^\top)$ being observable. From the controllability/observability
literature [?, p. 123], we know that (7) holds if and only if

\[
\mathrm{rank}\begin{bmatrix} (I - aQ)^\top - \mu I \\ x[0]^\top \end{bmatrix} = n
\]

for all eigenvalues $\mu$ of $(I - aQ)^\top$. This condition is satisfied if (i) the algebraic multiplicity of all the
eigenvalues of $(I - aQ)^\top$ is equal to one [?, p. 59] and (ii) $x[0]$ is not orthogonal to any eigenvector of $(I - aQ)^\top$ (which is
satisfied with probability one since $x[0]$ and $Q$ are drawn independently). This is, in turn, satisfied if the
algebraic multiplicity of every eigenvalue of $Q$ is equal to one. □
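
The two rank conditions used in the proof, the observability-type condition (7) and the eigenvalue-wise rank test, can be evaluated numerically. The sketch below does so for hypothetical data; `Q`, `a`, and `x0` are illustrative choices, and the symmetric eigenvalue routine is used because $I - aQ$ is symmetric whenever $Q$ is.

```python
import numpy as np

# Hypothetical data (illustrative only).
Q = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
a = 0.1
x0 = np.array([0.3, -0.2, 0.4])
n = 3

A = (np.eye(n) - a * Q).T

# Matrix in (7): rows x0^T A^j for j = 0, ..., n-1.
obs = np.vstack([x0 @ np.linalg.matrix_power(A, j) for j in range(n)])
rank_obs = np.linalg.matrix_rank(obs)

# Rank test at every eigenvalue mu of (I - aQ)^T.
eigvals = np.linalg.eigvalsh(A)
pbh_ranks = [
    np.linalg.matrix_rank(np.vstack([A - mu * np.eye(n), x0[None, :]]))
    for mu in eigvals
]
print(rank_obs, pbh_ranks, eigvals)
```

Here the eigenvalues are distinct and `x0` has a nonzero component along every eigenvector, so both tests return full rank.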


The following theorem shows that after enough measurements, the set $\mathcal{M}[k]$ becomes a singleton.

Theorem 1. Let $n \geq 3$. Then $\mathcal{M}[k] = \{(aQ, aq)\}$ for all $k \geq \lceil (n+1)/2 \rceil$.

Proof. Because $Q'$ is a symmetric matrix, it only has $(n^2 + n)/2$ unknowns ($= 1 + 2 + \dots + n$). Therefore, the
eavesdropper needs to calculate $(n^2 + 3n)/2$ unknowns corresponding to the entries of $Q'$ and $q'$. Due to
Assumption 2, if the eavesdropper collects measurements up to $k = \lceil (n+1)/2 \rceil$ (notice that, by definition,
the number of measurements in $\mathcal{M}[k]$ is equal to $k+1$), the set of linear equations defining $\mathcal{M}[k]$ admits
a unique solution, $(\hat{Q}, \hat{q})$, where $\hat{Q} = aQ$ and $\hat{q} = aq$. To be able to use Assumption 2, we should have
$\lceil (n+1)/2 \rceil = k \leq n - 1$ based on Remark 6, which gives $n \geq 3$. □
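
As an illustration of the constant-step-size case, the sketch below recovers the scaled pair $(aQ, aq)$ from eavesdropped iterates by solving the linear equations $y[t] = Q'x[t] + q'$. It is a hedged example rather than the estimator analysed in the proof: it uses $n + 1$ measurement pairs so that the stacked system is square and invertible even without exploiting the symmetry of $Q'$, which is a conservative choice for the illustration and not the threshold stated in Theorem 1. All data and the helper name `eavesdrop` are hypothetical.

```python
import numpy as np

# Hypothetical data, unknown to the eavesdropper (illustrative only).
Q = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
q = np.array([1.0, -2.0, 0.5])
a = 0.1
n = 3

def eavesdrop(x0, num_pairs):
    """Return the pairs (x[t], y[t]) visible on the network."""
    xs, ys = [], []
    x = x0.copy()
    for _ in range(num_pairs):
        x_next = x - a * (Q @ x + q)
        xs.append(x)
        ys.append(x - x_next)
        x = x_next
    return np.array(xs), np.array(ys)

X, Y = eavesdrop(np.array([0.3, -0.2, 0.4]), n + 1)

# y[t]^T = [x[t]^T 1] [Q'^T; q'^T]: a square linear system in (Q', q').
X_aug = np.hstack([X, np.ones((n + 1, 1))])
theta = np.linalg.solve(X_aug, Y)
Q_hat, q_hat = theta[:n].T, theta[n]

print(Q_hat / a)    # equals Q only if the eavesdropper also knows a
print(q_hat / a)
```

The estimate matches $(aQ, aq)$, so the utility function is recovered up to the unknown scaling by the step size.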

Remark 8. If the eavesdropper collects $k < \lceil (n+1)/2 \rceil$ measurements, the set $\mathcal{M}[k]$ has infinitely many
elements. As a result, if the iterations are terminated at $k < \lceil (n+1)/2 \rceil$ iterations, it is impossible for the
eavesdropper to reconstruct the parameters of the utility function. However, the confidentiality is guaranteed
here at the price of getting a possibly inaccurate solution.

Remark 9. For $n < 3$, regardless of the number of collected measurements, $\mathcal{M}[k]$ never becomes a singleton.
This is due to the fact that Assumption 2 does not hold any more.

Remark 10. The presented estimator also works if the step sizes are selected as $a[k] = c/k^{\delta}$ for all $k \in \mathbb{Z}_{\geq 0}$
and for a fixed $\delta \in (1/2, 1]$. This is true because we can always scale the measurements $(x[t], y[t])_{t=0}^{k}$ to
$(x[t], t^{\delta} y[t])_{t=0}^{k}$ and, subsequently, use the presented results for the constant step size. If $\delta$ is not known by the
eavesdropper, we can construct a filter that also reconstructs $\delta$ based on the measurements $(x[t], y[t])_{t=0}^{k}$; however,
this would make the problem considerably more difficult because of the complexity of the Bayes updates in this
case.

This remark shows that the choice of a time-varying step size as described above does not make the
reconstruction of the utility function any harder than the constant step size case so long as $\delta$ is publicly
known.


3.2. Random Step Sizes Drawn from a Finite Set 

Next we consider the case where the step sizes $a[k]$, for $k \in \mathbb{Z}_{\geq 0}$, are drawn uniformly from a set $\mathcal{A} =
\{a^{(1)}, \dots, a^{(s)}\}$ of $s$ distinct values. Moreover, we assume that the step sizes $(a[k])_{k \in \mathbb{Z}_{\geq 0}}$ are independently
and identically distributed over time. First, let us present a condition under which Assumption 2 holds.


Lemma 2. Let $Q$ be such that the following condition is almost surely satisfied:

\[
\mathrm{rank}\begin{bmatrix} x[0]^\top \\ x[0]^\top (I - a[0]Q)^\top \\ \vdots \\ x[0]^\top (I - a[0]Q)^\top \cdots (I - a[n-2]Q)^\top \end{bmatrix} = n. \tag{8}
\]

Then, $(x[t])_{t=0}^{n-1}$ are almost surely independent.

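Proof. Similar to the proof of Lemma 1, notice that $x[k+1] = (I - a[k]Q)x[k] - a[k]q$. Therefore, we have

\[
\begin{bmatrix} x[0]^\top \\ x[1]^\top \\ \vdots \\ x[n-1]^\top \end{bmatrix}
=
\begin{bmatrix} x[0]^\top \\ x[0]^\top (I - a[0]Q)^\top \\ \vdots \\ x[0]^\top (I - a[0]Q)^\top \cdots (I - a[n-2]Q)^\top \end{bmatrix}
-
\begin{bmatrix} 0_{1 \times n} \\ (a[0]q)^\top \\ \vdots \\ \sum_{j=0}^{n-2} (a[j]q)^\top (I - a[j+1]Q)^\top \cdots (I - a[n-2]Q)^\top \end{bmatrix}. \tag{9}
\]

Since $x[0]$ is selected randomly and independently from the pair $(Q, q)$, the iterates $(x[t])_{t=0}^{n-1}$ are independent
if the condition in the statement of the lemma holds. □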


Remark 11. Condition (8) is intimately related to the observability of time-varying linear systems
$((I - a[k]Q)^\top, x[0]^\top)$; see [22, p. 462].
At iteration $k+1$, after observing $(x[t], y[t])_{t=0}^{k}$, the eavesdropper may use Bayes' update rule,
consecutively, to generate the conditional density function

\begin{align*}
p(Q', q' \mid (x[t], y[t])_{t=0}^{k})
&\propto p(y[k] \mid Q', q', (x[t], y[t])_{t=0}^{k-1}, x[k])\, p(Q', q' \mid (x[t], y[t])_{t=0}^{k-1}, x[k]) \\
&= p(y[k] \mid Q', q', (x[t], y[t])_{t=0}^{k-1}, x[k])\, p(Q', q' \mid (x[t], y[t])_{t=0}^{k-1}) \tag{10a} \\
&= p(y[k] \mid Q', q', x[k])\, p(Q', q' \mid (x[t], y[t])_{t=0}^{k-1}) \tag{10b} \\
&= \Big[ \sum_{a' \in \mathcal{A}} p(y[k] \mid Q', q', x[k], a')\, p(a') \Big] p(Q', q' \mid (x[t], y[t])_{t=0}^{k-1}) \tag{10c} \\
&\propto \Big[ \sum_{a' \in \mathcal{A}} p(y[k] \mid Q', q', x[k], a') \Big] p(Q', q' \mid (x[t], y[t])_{t=0}^{k-1}), \tag{10d}
\end{align*}

where (10a) follows from independence of the cost parameters $(Q, q)$ and $x[k]$ given $(x[t], y[t])_{t=0}^{k-1}$ (note
that $x[k]$ is merely the negation of the summation of all $(y[t])_{t=0}^{k-1}$ plus $x[0]$ and, hence, it is redundant
information), (10b) follows from independence of $y[k]$ from $(x[t], y[t])_{t=0}^{k-1}$ given $x[k]$ and the cost parameters
$(Q, q)$, (10c) follows from conditioning on $a$, and (10d) follows from the uniform distribution of the step sizes.
Now, note that

\[
\sum_{a' \in \mathcal{A}} p(y[k] \mid Q', q', x[k], a')
= \begin{cases} 1, & \exists a'' \in \mathcal{A} : y[k] = a''(Q'x[k] + q'), \\ 0, & \text{otherwise}, \end{cases}
= \mathbb{1}_{(Q', q') \in \mathcal{D}(x[k], y[k])},
\]

where $\mathcal{D}(x[k], y[k]) = \{(Q', q') \in \mathcal{S}_+^n \times \mathbb{R}^n \mid \exists a' \in \mathcal{A} : y[k] = a'(Q'x[k] + q')\}$. Hence, we have
$p(Q', q' \mid (x[t], y[t])_{t=0}^{k}) \propto \mathbb{1}_{(Q', q') \in \mathcal{D}(x[k], y[k])}\, p(Q', q' \mid (x[t], y[t])_{t=0}^{k-1})$. Using induction, we can show that

\[
p(Q', q' \mid (x[t], y[t])_{t=0}^{k}) \propto \Big[ \prod_{t=0}^{k} \mathbb{1}_{(Q', q') \in \mathcal{D}(x[t], y[t])} \Big] p(Q', q')
= \mathbb{1}_{(Q', q') \in \cap_{t=0}^{k} \mathcal{D}(x[t], y[t])}\, p(Q', q').
\]

Now, we can redefine

\[
\mathcal{M}[k] = \cap_{t=0}^{k} \mathcal{D}(x[t], y[t]) = \{(Q', q') \in \mathcal{S}_+^n \times \mathbb{R}^n \mid \exists a'[t] \in \mathcal{A} : y[t] = a'[t](Q'x[t] + q'),\ \forall t = 0, \dots, k\}.
\]

Therefore, Bayes' rule gives $p(Q', q' \mid (x[t], y[t])_{t=0}^{k}) \propto \mathbb{1}_{(Q', q') \in \mathcal{M}[k]}\, p(Q', q')$. The next theorem shows that the
set $\mathcal{M}[k]$ becomes a singleton and, hence, the Bayesian filter converges to the correct parameter selection.

Theorem 2. Let $n \geq 5$. Then $\mathcal{M}[k] = \{(Q, q)\}$ for all $k \geq \lceil (n+3)/2 \rceil$.

Proof. Let us enumerate all the possible sequences of the step sizes $\mathcal{A}^{k+1}$ for each $k$. Now, for any
sequence of step sizes $(a'[t])_{t=0}^{k} \in \mathcal{A}^{k+1}$, the consistent parameter sets are given by the set-valued mapping
$\mathcal{Z}[k]((a'[t])_{t=0}^{k}) = \{(Q', q') \mid Q'x[t] + q' = y[t]/a'[t],\ \forall t = 0, \dots, k\}$. Evidently, $\mathcal{Z}[k]((a'[t])_{t=0}^{k})$ is either a singleton
or the empty set if $(n+1)/2 \leq k$. This is the case because for the mentioned horizon length, given the
sequences of the step sizes, the number of equations (which we assumed are not redundant; see Assumption 2)
is larger than or equal to the number of free variables. Now, if we get only one more measurement, i.e.,
$k = \lceil (n+3)/2 \rceil$, only one of these sets remains nonempty, and it points to the true parameters. To be able
to use Assumption 2, we should have $\lceil (n+3)/2 \rceil = k \leq n - 1$, which gives $n \geq 5$. □
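
The indicator-based Bayes update above can be illustrated on a finite list of candidate parameters with a uniform prior. The sketch below is a toy version under stated assumptions: the step-size set, the fixed step sequence, and the four hypotheses are hypothetical, and the continuous parameter space of the paper is replaced by a finite list so that the posterior is a plain vector.

```python
import numpy as np

# Hypothetical true parameters and step sizes (illustrative only).
Q = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
q = np.array([1.0, -2.0, 0.5])
step_set = (0.1, 0.25, 0.4)           # finite set of admissible step sizes
true_steps = [0.25, 0.1, 0.4]         # fixed here for reproducibility

# Finite list of hypotheses (Q', q'); the first one is the true pair.
hypotheses = [
    (Q, q),
    (2.5 * Q, 2.5 * q),               # scaled copy: explains a = 0.25 via a' = 0.1
    (Q + 0.5 * np.eye(3), q),
    (Q, q + np.array([0.1, 0.0, 0.0])),
]
posterior = np.full(len(hypotheses), 1.0 / len(hypotheses))   # uniform prior

def consistent(Qh, qh, x, y):
    """Indicator of D(x, y): some a' in the step set explains y."""
    return any(np.allclose(y, ap * (Qh @ x + qh), rtol=1e-9, atol=1e-12)
               for ap in step_set)

x = np.array([0.3, -0.2, 0.4])
history = []
for a in true_steps:
    x_next = x - a * (Q @ x + q)
    y = x - x_next
    indicator = np.array([float(consistent(Qh, qh, x, y))
                          for Qh, qh in hypotheses])
    posterior = posterior * indicator
    posterior = posterior / posterior.sum()
    history.append(posterior.copy())
    x = x_next

print(np.array(history))
```

After the first measurement the scaled copy is still consistent, because a different admissible step size explains the same data; it is ruled out at the second measurement, where the step size it would require is not in the set.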


Note that the estimator constructed in the proof of Theorem 2 relies on the fact that we can enumerate
all the possible sequences of the step sizes for $k = \lceil (n+3)/2 \rceil$ and solve a set of linear equations for each
one to extract the true parameters. The number of all the possible sequences of the step sizes is equal to
$s^{\lceil (n+3)/2 \rceil}$. Thus, this estimator is practically implementable only for relatively small $s$ and $n$.
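
A brute-force version of this enumeration can be sketched as follows. The example is illustrative only: the data, the step-size set, and the number of measurements (six pairs for $n = 3$, more than needed for a square system) are hypothetical choices, symmetry of $Q'$ is not enforced, and the step-size set is picked so that no rescaled copy of the true sequence lies in the set.

```python
import itertools
import numpy as np

# Hypothetical data (illustrative only).
Q = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
q = np.array([1.0, -2.0, 0.5])
n = 3
step_set = (0.1, 0.25, 0.4)
true_steps = (0.25, 0.1, 0.4, 0.1, 0.25, 0.4)

x = np.array([0.3, -0.2, 0.4])
X, Y = [], []
for a in true_steps:
    x_next = x - a * (Q @ x + q)
    X.append(x)
    Y.append(x - x_next)
    x = x_next
X, Y = np.array(X), np.array(Y)
X_aug = np.hstack([X, np.ones((len(true_steps), 1))])

# Try every candidate sequence; keep those whose linear system is solvable.
consistent = []
for cand in itertools.product(step_set, repeat=len(true_steps)):
    target = Y / np.array(cand)[:, None]          # rows y[t]^T / a'[t]
    theta, *_ = np.linalg.lstsq(X_aug, target, rcond=None)
    residual = np.linalg.norm(X_aug @ theta - target)
    if residual <= 1e-9 * np.linalg.norm(target):
        consistent.append((cand, theta[:n].T, theta[n]))

print(len(consistent), "consistent sequence(s) out of", len(step_set) ** len(true_steps))
```

The loop solves one least-squares problem per candidate sequence, so its cost grows exponentially with the number of measurements, which is the practical limitation noted above.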

However, this problem can be fixed with a simple change of variable. To do so, we can alternatively define
the set of utility functions consistent with the observations as

\[
\mathcal{M}[k] = \{(Q', q') \in \mathcal{S}_+^n \times \mathbb{R}^n \mid \exists \beta[t] \in \{1/a^{(1)}, \dots, 1/a^{(s)}\} : \beta[t]y[t] = Q'x[t] + q',\ \forall t = 0, \dots, k\}.
\]

Following the same line of reasoning as in Remark 6, we can see that the elements of $\mathcal{M}[k]$ are the solutions
of the set of equations

\[
\left( \begin{bmatrix} x[0]^\top & 1 \\ \vdots & \vdots \\ x[k]^\top & 1 \end{bmatrix} \otimes I \right)
\begin{bmatrix} \mathrm{vec}(Q') \\ q' \end{bmatrix}
=
\begin{bmatrix} y[0]\beta[0] \\ \vdots \\ y[k]\beta[k] \end{bmatrix}.
\]

We may rewrite this set of equations as

\[
\underbrace{\begin{bmatrix}
x[0]^\top \otimes I & I & -y[0] & & 0 \\
\vdots & \vdots & & \ddots & \\
x[k]^\top \otimes I & I & 0 & & -y[k]
\end{bmatrix}}_{:= \Phi}
\begin{bmatrix} \mathrm{vec}(Q') \\ q' \\ \beta[0] \\ \vdots \\ \beta[k] \end{bmatrix} = 0.
\]

Using the arguments of Remark 6 and Lemma 1, we may observe that $\Phi$ is a full row rank matrix. Now,
note that the number of unknown decision variables here is $k + n(n+1)/2 + n$ and the number of equations
is $(k+1)n$. Therefore, for this set of equations to have a unique solution (up to a scaling), we need to
have $k + n(n+1)/2 + n \leq (k+1)n$, which means $k \geq n(n+1)/(2(n-1))$. This is satisfied if we select
$k \geq \lceil (n+3)/2 \rceil$ as recommended in Theorem 2. Therefore, with this change of variable, we can reconstruct
the utility function based on the observations in polynomial time.
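
A numerical sketch of this change of variable is given below: the unknown inverse step sizes are treated as extra variables, and the parameters are read off a null vector of the resulting homogeneous system, computed with a singular value decomposition. The data, the step sequence, and the number of measurements (eight pairs for $n = 3$, chosen generously) are hypothetical, and the sketch does not enforce symmetry of $Q'$ or membership of the inverse step sizes in the finite set.

```python
import numpy as np

# Hypothetical data (illustrative only).
Q = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
q = np.array([1.0, -2.0, 0.5])
n = 3
steps = np.array([0.25, 0.1, 0.4, 0.1, 0.25, 0.4, 0.1, 0.25])

x = np.array([0.3, -0.2, 0.4])
X, Y = [], []
for a in steps:
    x_next = x - a * (Q @ x + q)
    X.append(x)
    Y.append(x - x_next)
    x = x_next
X, Y = np.array(X), np.array(Y)
m = len(steps)

# Homogeneous system in the unknowns [vec(Q'); q'; beta[0..m-1]].
X_aug = np.hstack([X, np.ones((m, 1))])
left = np.kron(X_aug, np.eye(n))            # acts on [vec(Q'); q']
right = np.zeros((m * n, m))
for t in range(m):
    right[t * n:(t + 1) * n, t] = -Y[t]     # -y[t] multiplies beta[t]
Phi = np.hstack([left, right])

# The right singular vector of the smallest singular value spans the null space.
_, sing, Vt = np.linalg.svd(Phi)
theta = Vt[-1]
Q_hat = theta[:n * n].reshape((n, n), order="F")
q_hat = theta[n * n:n * n + n]
beta_hat = theta[n * n + n:]

gamma = np.vdot(Q_hat, Q) / np.vdot(Q, Q)   # unknown scaling of the null vector
print(Q_hat / gamma)
print(1.0 / (beta_hat / gamma))             # implied step sizes
```

The cost here is that of a single decomposition of a matrix whose size grows polynomially in $n$ and $k$, in contrast to the enumeration over all step-size sequences.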

Remark 12. Note that, in this subsection, we did not use the fact that A is a finite set. Therefore, the 
presented argument is also valid when the step sizes are selected from a closed interval of the positive reals.


3.3. Agent-Dependent Step Sizes 

In this subsection, we assume that each entry of the decision variable $x[k]$ is updated by an agent that can
select its step size independently. In this case, the gradient iteration for each entry of the decision variable
becomes

\[ x_i[k+1] = x_i[k] - a_i[k]\Big( q_{ii} x_i[k] + \sum_{j \neq i} q_{ij} x_j[k] + q_i \Big), \]

where $a_i[k]$ are independently and identically distributed discrete random variables selected with equal probability
from the set $\mathcal{A}$. Let $a_i[k]$ and $a_j[k]$ be statistically independent if $i \neq j$. As a result, we get

\[ x[k+1] = x[k] - A[k]\,(Qx[k] + q), \]

where

\[ A[k] = \begin{bmatrix} a_1[k] & & 0 \\ & \ddots & \\ 0 & & a_n[k] \end{bmatrix}. \]

We may define $\mathcal{A}^{n \times n}$ as the set of diagonal matrices of size $n \times n$ with entries belonging to $\mathcal{A}$. By definition,
$A[k] \in \mathcal{A}^{n \times n}$ for all $k \geq 0$. Note that we can guarantee the convergence of the gradient algorithm even when
using agent-dependent step sizes; see Appendix A for more information. Similar to the previous subsections,
let us define

\[ y[k] = A[k]\,(Qx[k] + q). \]

As before, at time step $k+1$, measurement pairs $(x[t], y[t])_{t=0}^{k}$ are available to the eavesdropper. Following
the same line of reasoning as in the previous subsections, at iteration $k+1$, the eavesdropper may use
Bayes' update rule, consecutively, to generate the conditional density function

\[ p(Q', q' \mid (x[t], y[t])_{t=0}^{k}) \propto \mathbb{1}_{(Q', q') \in \mathcal{D}(x[k], y[k])}\, p(Q', q' \mid (x[t], y[t])_{t=0}^{k-1}), \tag{11} \]

where $\mathcal{D}(x[k], y[k]) = \{(Q', q') \in \mathcal{S}_+^n \times \mathbb{R}^n \mid \exists A' \in \mathcal{A}^{n \times n} : y[k] = A'(Q'x[k] + q')\}$. Hence, using induction, we
get

\[
p(Q', q' \mid (x[t], y[t])_{t=0}^{k}) \propto \Big[ \prod_{t=0}^{k} \mathbb{1}_{(Q', q') \in \mathcal{D}(x[t], y[t])} \Big] p(Q', q')
= \mathbb{1}_{(Q', q') \in \mathcal{M}[k]}\, p(Q', q'),
\]

where

\[
\mathcal{M}[k] = \cap_{t=0}^{k} \mathcal{D}(x[t], y[t]) = \{(Q', q') \in \mathcal{S}_+^n \times \mathbb{R}^n \mid \exists A'[t] \in \mathcal{A}^{n \times n} : y[t] = A'[t](Q'x[t] + q'),\ \forall t = 0, \dots, k\}.
\]

The next theorem shows that the Bayesian filter is always inconclusive.

Theorem 3. The cardinality of the set $\mathcal{M}[k]$ is uncountably infinite for all $k \in \mathbb{Z}_{\geq 0}$.

Proof. Firstly, note that we can redefine $\mathcal{M}[k]$ as

\[
\mathcal{M}[k] = \cap_{t=0}^{k} \mathcal{D}(x[t], y[t]) = \{(Q', q') \in \mathcal{S}_+^n \times \mathbb{R}^n \mid \exists B[t] \in \mathcal{B}^{n \times n} : B[t]y[t] = Q'x[t] + q',\ \forall t = 0, \dots, k\},
\]

where $\mathcal{B}^{n \times n}$ denotes the set of all diagonal matrices of size $n \times n$ with diagonal entries belonging to $\mathcal{B} =
\{1/a^{(1)}, \dots, 1/a^{(s)}\}$. This way, as described in the previous subsection, we need to solve a linear set of
equations to find the elements of the set $\mathcal{M}[k]$. Now, note that, in each iteration, we receive $n$ new measurements
while adding $n$ new variables (i.e., step sizes). Therefore, even if all the measurements are independent,
there is not a unique solution (since the number of unknowns is always strictly larger than the number of
measurements). □

Theorem 3 shows that no matter how many measurements the eavesdropper gathers, it is impossible to
reconstruct the cost function.
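
The counting argument in the proof can be illustrated numerically. The sketch below works in the linear relaxation used in the proof, that is, the inverse step sizes are treated as free real unknowns rather than elements of the finite set, and symmetry of $Q'$ is not imposed when counting. It also exhibits a second symmetric positive-definite pair that explains the same measurements with positive agent-dependent step sizes. All numerical values are hypothetical.

```python
import numpy as np

# Hypothetical data (illustrative only).
Q = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
q = np.array([1.0, -2.0, 0.5])
n = 3
# Row t holds the step sizes (a_1[t], ..., a_n[t]) picked by the n agents.
steps = np.array([[0.05, 0.10, 0.20],
                  [0.20, 0.05, 0.10],
                  [0.10, 0.20, 0.05]])

x = np.array([0.3, -0.2, 0.4])
X, Y = [], []
for a_vec in steps:
    x_next = x - a_vec * (Q @ x + q)       # A[k] = diag(a_vec)
    X.append(x)
    Y.append(x - x_next)
    x = x_next
X, Y = np.array(X), np.array(Y)
m = len(steps)

# Relaxed linear system B[t] y[t] = Q' x[t] + q' with free diagonal B[t].
X_aug = np.hstack([X, np.ones((m, 1))])
Phi = np.hstack([np.kron(X_aug, np.eye(n)), -np.diag(Y.ravel())])
nullity = Phi.shape[1] - np.linalg.matrix_rank(Phi)

# Another symmetric positive-definite pair that explains the same data.
Q_alt = Q + 0.1 * np.eye(n)
q_alt = q.copy()
B_alt = (X @ Q_alt.T + q_alt) / Y          # implied inverse step sizes, entrywise

print("null-space dimension:", nullity)
print("implied step sizes for the alternative pair:\n", 1.0 / B_alt)
```

Every new measurement adds $n$ equations and $n$ unknown step sizes, so the dimension of the solution set of the relaxed system never shrinks below the number of entries of $(Q', q')$.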


4. Constrained Case 


For the constrained optimization problem, we add the constraints using logarithmic barrier functions. In
that case, we get the unconstrained optimization problem

\[ \max_{x} \ -\tfrac{1}{2} x^\top Q x - q^\top x + \lambda \sum_{i=1}^{m} \log(d_i - C_i x), \]

where $C_i \in \mathbb{R}^{1 \times n}$ is the $i$-th row of the matrix $C$ and $\lambda > 0$ is a scaling factor. As $\lambda$ approaches zero, the solution
of this problem converges to the solution of the original constrained optimization problem [23, p. 566]. When
using the gradient algorithm, in this case, we get

\[ x[k+1] = x[k] - a[k]\Big( Qx[k] + q + \sum_{i=1}^{m} \frac{\lambda}{d_i - C_i x[k]}\, C_i^\top \Big). \]

Therefore, an eavesdropper that listens to consecutive iterations can construct the measurements

\[ y[k] = a[k]\Big( Qx[k] + q + \sum_{i=1}^{m} \frac{\lambda}{d_i - C_i x[k]}\, C_i^\top \Big). \]

In the remainder of this section, we assume that the eavesdropper knows the parameters of the constraints
$(C, d)$ and only wants to estimate the parameters of the utility function and the scaling factor.
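
The barrier-based iteration and the corresponding measurements can be sketched as follows. The two-dimensional problem, the box constraints, the scaling factor `lam`, and the step-size set are hypothetical values chosen so that the iterates stay strictly feasible; the helper name `barrier_gradient` is likewise only illustrative.

```python
import numpy as np

# Hypothetical data: a 2-D quadratic utility with box constraints C x <= d.
Q = np.array([[2.0, 0.5],
              [0.5, 1.0]])
q = np.array([-1.0, 0.5])
C = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
d = np.array([0.5, 1.0, 1.0, 0.5])
lam = 0.1                               # barrier scaling factor (lambda in the text)

def barrier_gradient(x):
    """Q x + q + lambda * sum_i C_i^T / (d_i - C_i x)."""
    slack = d - C @ x
    return Q @ x + q + lam * (C.T @ (1.0 / slack))

rng = np.random.default_rng(0)
steps = rng.choice([0.02, 0.05], size=300)   # random step sizes from a finite set

x = np.zeros(2)                         # strictly feasible starting point
X, Y = [x], []
for a in steps:
    x_next = x - a * barrier_gradient(x)
    Y.append(x - x_next)                # measurement seen by the eavesdropper
    x = x_next
    X.append(x)
X, Y = np.array(X), np.array(Y)

print("last iterate:", X[-1])
print("smallest slack along the run:", np.min(d - X @ C.T))
```

Since $(C, d)$ are assumed known to the eavesdropper, the barrier term in each measurement is known up to the factor $\lambda$, so for a given step size the measurement is linear in the unknowns $(Q, q, \lambda)$.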



Remark 13. In many problems, the constraints are enforced by the physical characteristics of the problem 
and are hence public knowledge. This is different from the utility function that is typically motivated by the 
internal mechanism of the system and the priorities of its operator and is hence kept private. 


Let us consider the case where the step sizes are selected randomly and uniformly from a finite set $\mathcal{A} =
\{a^{(1)}, \dots, a^{(s)}\}$ as in Subsection 3.2; the proofs for the other cases are not different.

Assumption 3. The parameter $\lambda \in \Lambda \subseteq \mathbb{R}_{\geq 0}$ is randomly generated according to the probability density
function $p : \Lambda \to \mathbb{R}_{\geq 0}$. Assume that $x[0]$ is chosen uniformly at random from $\{x \mid Cx \leq d\}$. Further, the
distribution of $\lambda$ is independent of the initialization of the algorithm $x[0]$ and the utility function parameters
$(Q, q)$.

Similarly, we can present the following condition for the satisfaction of Assumption 2.

Lemma 3. Let $Q$ be such that the following condition is almost surely satisfied:

\[
\mathrm{rank}\begin{bmatrix} x[0]^\top \\ x[0]^\top (I - a[0]Q)^\top \\ \vdots \\ x[0]^\top (I - a[0]Q)^\top \cdots (I - a[n-2]Q)^\top \end{bmatrix} = n.
\]

Then, $(x[t])_{t=0}^{n-1}$ are almost surely independent.


Proof. The proof follows from the same line of reasoning as in Lemma 2. The only difference is that, in this
case, the right-hand side of (9) admits additional nonlinear terms that are multiplied by $\lambda$. However, since $\lambda$
is selected independently of the parameters and the initial condition (see Assumption 3), these terms almost
surely do not contribute to the rank of the matrix on the left-hand side of (9). □
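At iteration $k+1$, after observing $(x[t], y[t])_{t=0}^{k}$, the eavesdropper may use Bayes' update rule,
consecutively, to generate the conditional density function

\begin{align*}
p(Q', q', \lambda' \mid (x[t], y[t])_{t=0}^{k})
&\propto \Big[ \sum_{a' \in \mathcal{A}} p(y[k] \mid Q', q', \lambda', x[k], a')\, p(a') \Big] p(Q', q', \lambda' \mid (x[t], y[t])_{t=0}^{k-1}) \\
&\propto \Big[ \sum_{a' \in \mathcal{A}} p(y[k] \mid Q', q', \lambda', x[k], a') \Big] p(Q', q', \lambda' \mid (x[t], y[t])_{t=0}^{k-1}).
\end{align*}

Similarly, we have $\sum_{a' \in \mathcal{A}} p(y[k] \mid Q', q', \lambda', x[k], a') = \mathbb{1}_{(Q', q', \lambda') \in \mathcal{D}(x[k], y[k])}$, where

\[
\mathcal{D}(x[k], y[k]) = \Big\{ (Q', q', \lambda') \in \mathcal{S}_+^n \times \mathbb{R}^n \times \Lambda \,\Big|\, \exists a' \in \mathcal{A} : y[k] = a'\Big( Q'x[k] + q' + \lambda' \sum_{i=1}^{m} \frac{C_i^\top}{d_i - C_i x[k]} \Big) \Big\}.
\]

Again, using induction, we can show that

\[
p(Q', q', \lambda' \mid (x[t], y[t])_{t=0}^{k}) \propto \Big[ \prod_{t=0}^{k} \mathbb{1}_{(Q', q', \lambda') \in \mathcal{D}(x[t], y[t])} \Big] p(Q', q')\, p(\lambda').
\]

Now, we can define

\begin{align*}
\mathcal{M}[k] &= \cap_{t=0}^{k} \mathcal{D}(x[t], y[t]) \\
&= \Big\{ (Q', q', \lambda') \in \mathcal{S}_+^n \times \mathbb{R}^n \times \Lambda \,\Big|\, \exists a'[t] \in \mathcal{A} : y[t] = a'[t]\Big( Q'x[t] + q' + \lambda' \sum_{i=1}^{m} \frac{C_i^\top}{d_i - C_i x[t]} \Big),\ \forall t = 0, \dots, k \Big\},
\end{align*}

which gives $p(Q', q', \lambda' \mid (x[t], y[t])_{t=0}^{k}) \propto \mathbb{1}_{(Q', q', \lambda') \in \mathcal{M}[k]}\, p(Q', q')\, p(\lambda')$. The next theorem shows that the Bayesian
filter converges to the correct parameter selection.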


Theorem 4. Let $n \geq 6$. Then $\mathcal{M}[k] = \{(Q, q, \lambda)\}$ for all $k \geq \lceil (n+3)/2 + 1/n \rceil$.

Proof. Let us assume that we use an estimator that, for each time step $k$, enumerates all the possible sequences
of the step sizes $\mathcal{A}^{k+1}$. Now, for any sequence of step sizes $(a'[t])_{t=0}^{k} \in \mathcal{A}^{k+1}$, the consistent parameter sets
are given by the mapping

\[
\mathcal{Z}[k]((a'[t])_{t=0}^{k}) = \Big\{ (Q', q', \lambda') \,\Big|\, Q'x[t] + q' + \lambda' \sum_{i=1}^{m} \frac{C_i^\top}{d_i - C_i x[t]} = \frac{y[t]}{a'[t]},\ \forall t = 0, \dots, k \Big\}.
\]

Similar to the proof of Theorem 2, $\mathcal{Z}[k]((a'[t])_{t=0}^{k})$ is either a singleton or the empty set if $k + 1 \geq ((n+1)n/2 + n + 1)/n
= (n+3)/2 + 1/n$. This is true since, given the sequences of the step sizes, the number of
equations is larger than or equal to the number of free variables. Now, if we get one more measurement,
i.e., $k = \lceil (n+3)/2 + 1/n \rceil$, only one of these sets remains nonempty. To be able to use Assumption 2, we
should have $\lceil (n+3)/2 + 1/n \rceil = k \leq n - 1$, which gives $n \geq 6$. □


Remark 14. In the interior point method, the algorithm automatically shrinks the scaling factor $\lambda$ to extract
the optimal point. If the rule for shrinking the scaling factor is known, one can use the Bayesian filter above 
to estimate the parameters of the utility function. 
Note that, similar to the previous section, we may introduce a change of variables to reconstruct the set
$\mathcal{M}[k]$, and subsequently the utility function, in polynomial time. To do so, note that we may alternatively
define $\mathcal{M}[k]$ as

$$\mathcal{M}[k] = \left\{ (Q', q', \lambda') \in \mathcal{S}^n \times \mathbb{R}^n \times \Lambda \;\middle|\; \exists \beta[t] \in \{ 1/\alpha' : \alpha' \in \mathcal{A} \} : \beta[t]\, y[t] = Q' x[t] + q' + \lambda' \sum_{i=1}^{m} \frac{c_i^\top}{d_i - c_i x[t]}, \ \forall t = 0, \dots, k \right\}.$$

This way, we get a set of linear equations, which, as shown in Theorem 4, admits a unique solution for
$k \geq \lceil (n+3)/2 + 1/n \rceil$.
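
To make the linear structure concrete, the following sketch fixes one candidate sequence $\beta[t]$ and solves the stacked equations for $(Q', q', \lambda')$ by least squares. It is a minimal illustration under stated assumptions, not the estimator of the paper: the names `barrier_term`, `symmetric_basis`, and `fit_candidate` are hypothetical, the rows of `C` are taken to be the vectors $c_i$, and the points $x[t]$ are sampled at random rather than generated by the gradient iteration. A near-zero residual indicates that the candidate sequence is consistent with the measurements, and a large residual indicates that its consistent set is empty.

```python
import numpy as np


def barrier_term(x, C, d):
    """sum_i c_i^T / (d_i - c_i x), with the c_i stored as rows of C (assumed form)."""
    return C.T @ (1.0 / (d - C @ x))


def symmetric_basis(n):
    """Basis of the n(n+1)/2-dimensional space of symmetric n x n matrices."""
    basis = []
    for i in range(n):
        for j in range(i, n):
            E = np.zeros((n, n))
            E[i, j] = E[j, i] = 1.0
            basis.append(E)
    return basis


def fit_candidate(xs, ys, betas, C, d):
    """Least-squares solve of beta[t] y[t] = Q' x[t] + q' + lambda' s[t] for one candidate."""
    n = xs.shape[1]
    basis = symmetric_basis(n)
    blocks, rhs = [], []
    for x, y, beta in zip(xs, ys, betas):
        # Columns: symmetric-basis coefficients of Q', then q', then lambda'.
        cols = [E @ x for E in basis] + [np.eye(n), barrier_term(x, C, d)]
        blocks.append(np.column_stack(cols))
        rhs.append(beta * y)
    A, b = np.vstack(blocks), np.concatenate(rhs)
    theta = np.linalg.lstsq(A, b, rcond=None)[0]
    Q = sum(c * E for c, E in zip(theta[:len(basis)], basis))
    q, lam = theta[len(basis):-1], theta[-1]
    return Q, q, lam, np.linalg.norm(A @ theta - b)


# Illustrative data (all values are assumptions made for this example).
rng = np.random.default_rng(0)
n, K = 3, 8
Q_true = np.array([[2.0, 0.5, 0.0], [0.5, 1.5, 0.3], [0.0, 0.3, 1.0]])
q_true = np.array([1.0, -0.5, 0.2])
lam_true = 0.5
C = rng.uniform(-1.0, 1.0, size=(4, n))
d = 2.0 * np.ones(4)
xs = rng.uniform(-0.5, 0.5, size=(K, n))
alphas = rng.choice([0.1, 0.2, 0.3], size=K)
ys = np.array([a * (Q_true @ x + q_true + lam_true * barrier_term(x, C, d))
               for a, x in zip(alphas, xs)])

# Correct candidate: beta[t] = 1 / alpha[t].
Q_hat, q_hat, lam_hat, res_true = fit_candidate(xs, ys, 1.0 / alphas, C, d)

# Wrong candidate: one entry of the sequence is changed.
wrong = 1.0 / alphas
wrong[0] *= 2.0
_, _, _, res_wrong = fit_candidate(xs, ys, wrong, C, d)
print(res_true, res_wrong)
```

Each candidate costs one least-squares solve; the sketch does not address how the candidate sequences themselves are enumerated.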

Alternatively, we can use independently selected random step sizes at each agent to render the problem
of reconstructability impossible. In this case, the update gradient algorithm becomes

$$x[k+1] = x[k] - A[k] \left( Q x[k] + q + \lambda \sum_{i=1}^{m} \frac{c_i^\top}{d_i - c_i x[k]} \right),$$

with a given initial condition $x[0]$, where the matrix $A[k]$ contains the stochastically-varying agent-dependent step sizes. Therefore, an eavesdropper
that listens to consecutive iterations can construct the measurements

$$y[k] = A[k] \left( Q x[k] + q + \lambda \sum_{i=1}^{m} \frac{c_i^\top}{d_i - c_i x[k]} \right).$$

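Now, employing the Bayesian filter, the estimator can deduce that

$$p(Q', q', \lambda' \mid (x[t], y[t])_{t=0}^{k}) \propto \mathbb{1}_{(Q', q', \lambda') \in \mathcal{M}[k]}\, p(Q', q')\, p(\lambda'),$$

where

$$\mathcal{M}[k] = \left\{ (Q', q', \lambda') \in \mathcal{S}^n \times \mathbb{R}^n \times \Lambda \;\middle|\; \exists \text{ admissible } A[t] : y[t] = A[t] \left( Q' x[t] + q' + \lambda' \sum_{i=1}^{m} \frac{c_i^\top}{d_i - c_i x[t]} \right), \ \forall t = 0, \dots, k \right\}.$$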

Now, we may prove the following impossibility result.


Theorem 5. The cardinality of the set $\mathcal{M}[k]$ is uncountably infinite for all $k \in \mathbb{Z}_{\geq 0}$.

Proof. The proof is similar to that of Theorem 3. □
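
One way to see why the set of consistent parameters is so large is a scaling argument, sketched below under explicit assumptions that are not taken from the paper: the admissible step-size matrices are assumed to include every positive diagonal matrix, the rows of `C` are the vectors $c_i$, and all numerical values and function names are hypothetical. Under these assumptions, for any $c > 0$ the scaled parameters $(cQ, cq, c\lambda)$ together with the step sizes $A[t]/c$ reproduce exactly the same measurements $y[t]$.

```python
import numpy as np


def barrier_term(x, C, d):
    """sum_i c_i^T / (d_i - c_i x), with the c_i stored as rows of C (assumed form)."""
    return C.T @ (1.0 / (d - C @ x))


def eavesdropped_steps(Q, q, lam, C, d, x0, steps, rng, lo=0.01, hi=0.05):
    """Run the update with random diagonal step sizes and record (x[t], y[t], A[t])."""
    xs, ys, As = [], [], []
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        A = np.diag(rng.uniform(lo, hi, size=x.size))
        y = A @ (Q @ x + q + lam * barrier_term(x, C, d))
        xs.append(x.copy())
        ys.append(y)
        As.append(A)
        x = x - y
    return xs, ys, As


def is_consistent(Qp, qp, lamp, As_p, xs, ys, C, d):
    """True if the candidate parameters and step sizes reproduce every y[t]."""
    return all(
        np.allclose(y, A @ (Qp @ x + qp + lamp * barrier_term(x, C, d)))
        for x, y, A in zip(xs, ys, As_p)
    )


rng = np.random.default_rng(0)
Q = np.array([[2.0, 0.5, 0.0], [0.5, 1.5, 0.3], [0.0, 0.3, 1.0]])
q = np.array([1.0, -0.5, 0.2])
lam = 0.5
C = rng.uniform(-1.0, 1.0, size=(4, 3))
d = 2.0 * np.ones(4)
xs, ys, As = eavesdropped_steps(Q, q, lam, C, d, np.zeros(3), steps=5, rng=rng)

# Every scale factor yields another parameter triple that explains the data.
scales = [0.5, 2.0, 3.7]
results = [
    is_consistent(c * Q, c * q, c * lam, [A / c for A in As], xs, ys, C, d)
    for c in scales
]
print(results)
```

The scaling family is only one direction of ambiguity; with agent-dependent step sizes each coordinate of $y[t]$ can be rescaled separately, which enlarges the consistent set further.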
5. Conclusions


In this paper, we studied the problem of how to keep a utility function confidential even when the network 
over which the function is being optimized is compromised. Particularly, we considered the problem where 
an eavesdropper’s objective is to reconstruct a quadratic utility function via measuring the decision variable 
iterations under a gradient method. We considered the impact of different choices of the step size on the 
reconstructability of the utility function and showed that, when the step size is not constant
and is selected randomly from a sufficiently large set of appropriate candidates, it is virtually impossible for
an eavesdropper to reconstruct the utility function. Therefore, the best design recommendation is to add
time-varying agent-dependent random step sizes to the implemented dynamics. In addition to time-varying
random step sizes, other ingredients matter, such as having a uniform random direction for the
initial condition and not executing too many gradient descent steps (if the number of steps is below a
threshold, the solution cannot be uniquely determined even with access to extraordinary computational
capabilities). An interesting avenue for future research could be to devise a tractable algorithm that can
approximate the utility function and bound the accuracy of the approximation based on the statistics of 
the step size selection method. This can be done by using set-membership identification techniques (see 
Remark El) for bounding the difference between the identified and the true set of permissible parameters. 
Another avenue for future research could be to study quasi-Newton or Newton methods, because they
require fewer iterations to converge and, thus, potentially reduce the amount of leaked information.
Appendix A. Convergence of Gradient Algorithm with Agent-Dependent Step Sizes


To present the results of this appendix, we need to introduce some notation. For a matrix $A$, we write
$A \preceq 0$ if $A$ is a symmetric negative semi-definite matrix. Further, for matrices $A$ and $B$ of the same dimension,
we write $A \preceq B$ if $A - B \preceq 0$.

Note that we follow the update rule $x[k+1] = x[k] - A[k](Qx[k] + q)$, where $A[k]$ is the matrix of step
sizes satisfying $c_1 I \preceq A[k] \preceq c_2 I$ for all $k$ with $0 < c_1 < c_2$. In the remainder of this appendix, we determine
conditions on $c_1, c_2$ that guarantee the convergence of the gradient algorithm. These iterates converge to the
maximizer of the utility function so long as they satisfy Wolfe’s conditions; see Theorem 3.2 in [^, p. 38].
From Wolfe’s conditions, for $0 < \epsilon_1 < \epsilon_2 < 1$, we have


$$(x[k] - A[k](Qx[k] + q))^\top Q\, (x[k] - A[k](Qx[k] + q)) + q^\top (x[k] - A[k](Qx[k] + q)) - x[k]^\top Q x[k] - q^\top x[k] \leq -\epsilon_1 (Qx[k] + q)^\top A[k] (Qx[k] + q), \tag{A.1a}$$

$$(Qx[k] + q)^\top A[k] \left( Q (x[k] - A[k](Qx[k] + q)) + q \right) \leq \epsilon_2 (Qx[k] + q)^\top A[k] (Qx[k] + q). \tag{A.1b}$$


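We may rewrite (A.1a) and (A.1b), respectively, as

$$\begin{bmatrix} A[k]^{1/2} Q x[k] \\ A[k]^{1/2} q \end{bmatrix}^\top \begin{bmatrix} A[k]^{1/2} Q A[k]^{1/2} + (\epsilon_1 - 2) I & \frac{2\epsilon_1 - 3}{2} I + A[k]^{1/2} Q A[k]^{1/2} \\ \frac{2\epsilon_1 - 3}{2} I + A[k]^{1/2} Q A[k]^{1/2} & A[k]^{1/2} Q A[k]^{1/2} + (\epsilon_1 - 1) I \end{bmatrix} \begin{bmatrix} A[k]^{1/2} Q x[k] \\ A[k]^{1/2} q \end{bmatrix} \leq 0,$$

and

$$\begin{bmatrix} A[k]^{1/2} Q x[k] \\ A[k]^{1/2} q \end{bmatrix}^\top \begin{bmatrix} I \\ I \end{bmatrix} \left( (1 - \epsilon_2) I - A[k]^{1/2} Q A[k]^{1/2} \right) \begin{bmatrix} I & I \end{bmatrix} \begin{bmatrix} A[k]^{1/2} Q x[k] \\ A[k]^{1/2} q \end{bmatrix} \leq 0.$$
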

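These conditions are satisfied if the following inequalities hold

$$\begin{bmatrix} (\epsilon_1 - 2) I & \frac{2\epsilon_1 - 3}{2} I \\ \frac{2\epsilon_1 - 3}{2} I & (\epsilon_1 - 1) I \end{bmatrix} + \begin{bmatrix} I \\ I \end{bmatrix} A[k]^{1/2} Q A[k]^{1/2} \begin{bmatrix} I & I \end{bmatrix} \preceq 0, \tag{A.2a}$$

$$(1 - \epsilon_2) I - A[k]^{1/2} Q A[k]^{1/2} \preceq 0. \tag{A.2b}$$

For (A.2a) to hold, it is sufficient to satisfy

$$\begin{bmatrix} (\epsilon_1 - 2) I & \frac{2\epsilon_1 - 3}{2} I \\ \frac{2\epsilon_1 - 3}{2} I & (\epsilon_1 - 1) I \end{bmatrix} + c_2 \lambda_{\max}(Q) \begin{bmatrix} I & I \\ I & I \end{bmatrix} \preceq 0, \tag{A.3}$$
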


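where $\lambda_{\max}(Q)$ denotes the largest eigenvalue of $Q$. Using Schur’s complement, we can translate (A.3) to

$$\epsilon_1 - 2 + c_2 \lambda_{\max}(Q) < 0,$$

$$\left( \epsilon_1 - 1 + c_2 \lambda_{\max}(Q) \right) \left( \epsilon_1 - 2 + c_2 \lambda_{\max}(Q) \right) - \left( \frac{2\epsilon_1 - 3}{2} + c_2 \lambda_{\max}(Q) \right)^2 \geq 0.$$

This holds if $\epsilon_1 - 2 + c_2 \lambda_{\max}(Q) < 0$. For (A.2b) to hold, it is sufficient to have $1 - \epsilon_2 - c_1 \lambda_{\min}(Q) \leq 0$, where
$\lambda_{\min}(Q)$ denotes the smallest eigenvalue of $Q$. Therefore, for the gradient algorithm to converge, it suffices
to select $c_1 \geq (1 - \epsilon_2)/\lambda_{\min}(Q)$ and $c_2 < (2 - \epsilon_1)/\lambda_{\max}(Q)$ for $0 < \epsilon_1 < \epsilon_2 < 1$.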
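
The step-size rule can be tried out numerically. The sketch below is an illustration under assumptions that are not from the paper: $Q$ is taken to be symmetric positive definite, the values of $Q$, $q$, $\epsilon_1$, $\epsilon_2$ and the initial point are arbitrary, the function names are hypothetical, and the upper end of the sampling interval is set to $0.9$ times the bound on $c_2$ so that the strict inequality holds. It draws a fresh diagonal $A[k]$ at every iteration and compares the final iterate with the point where $Qx + q = 0$.

```python
import numpy as np


def step_size_bounds(Q, eps1, eps2):
    """Return (lower bound on c1, strict upper bound on c2) from the eigenvalues of Q."""
    eigs = np.linalg.eigvalsh(Q)  # ascending order for symmetric Q
    c1_min = (1.0 - eps2) / eigs[0]
    c2_sup = (2.0 - eps1) / eigs[-1]
    return c1_min, c2_sup


def run_gradient(Q, q, x0, c1, c2, iters, rng):
    """Iterate x <- x - A (Q x + q) with A diagonal, entries uniform in [c1, c2]."""
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        A = np.diag(rng.uniform(c1, c2, size=x.size))
        x = x - A @ (Q @ x + q)
    return x


Q = np.array([[2.0, 0.5], [0.5, 1.0]])
q = np.array([1.0, -1.0])
c1_min, c2_sup = step_size_bounds(Q, eps1=0.1, eps2=0.9)

rng = np.random.default_rng(1)
x_final = run_gradient(Q, q, [5.0, -3.0], c1_min, 0.9 * c2_sup, 2000, rng)
x_star = np.linalg.solve(Q, -q)  # point where Q x + q = 0
print(c1_min, c2_sup, x_final, x_star)
```

Note that the interval for the step sizes is nonempty only when $(1 - \epsilon_2)/\lambda_{\min}(Q)$ is smaller than $(2 - \epsilon_1)/\lambda_{\max}(Q)$, which depends on the conditioning of $Q$ and on the chosen $\epsilon_1, \epsilon_2$.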